chore: promote staging to staging-promote/ec04354c-23271447493 (2026-03-19 04:37 UTC) #1396
* feat(telegram): support auto split large message

* fix(telegram): strengthen split_message test assertion

Replace word-by-word contains check with assert_eq! on rejoined chunks, ensuring split_message preserves content exactly. send_response is still used (lines 745, 753) so it is intentionally kept.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(telegram): add missing split_message tests and document limitations

- Add test for sentence-boundary splitting
- Add test for hard-cut on pathological input (no spaces)
- Add test for multi-byte character safety (emoji)
- Document CJK sentence punctuation limitation
- Document trim behavior at chunk boundaries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: re-trigger CI with latest changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Hans <me@hans00.me>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
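For readers skimming the telegram commits, here is a minimal sketch of the splitting strategy they describe: prefer a sentence boundary, fall back to whitespace, and hard-cut on a character boundary only when neither exists. Only the name `split_message` comes from the commit messages. The signature, the byte-based `max_bytes` limit, and the `". "` sentence heuristic are assumptions for illustration, not the code in channels-src/telegram/src/lib.rs. As the commits note, whitespace at a cut is dropped, and CJK sentence punctuation is not recognized by a heuristic like this one.

```rust
/// Hypothetical sketch, not the PR's implementation.
/// Splits `text` into chunks of at most `max_bytes` bytes each.
fn split_message(text: &str, max_bytes: usize) -> Vec<String> {
    // Any UTF-8 character fits in 4 bytes, so this guarantees progress.
    assert!(max_bytes >= 4, "limit must fit any single UTF-8 character");
    let mut chunks = Vec::new();
    let mut rest = text.trim();
    while rest.len() > max_bytes {
        // Back up to a char boundary so a multi-byte character (e.g. an emoji)
        // is never cut in half.
        let mut limit = max_bytes;
        while !rest.is_char_boundary(limit) {
            limit -= 1;
        }
        let window = &rest[..limit];
        // Prefer the last ". " (sentence end), then the last whitespace,
        // then a hard cut at the limit for input with no spaces at all.
        let cut = window
            .rfind(". ")
            .map(|i| i + 1) // keep the period in the current chunk
            .or_else(|| window.rfind(char::is_whitespace))
            .filter(|&i| i > 0)
            .unwrap_or(limit);
        chunks.push(rest[..cut].trim_end().to_string());
        // Whitespace at the boundary is dropped here (the "trim behavior").
        rest = rest[cut..].trim_start();
    }
    if !rest.is_empty() {
        chunks.push(rest.to_string());
    }
    chunks
}
```

Because boundary whitespace is trimmed, rejoining the chunks reproduces the input only when every cut fell on a single space; runs of spaces or newlines at a cut are not recoverable from the chunks alone.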
* feat(testing): add FaultInjector framework for StubLlm (#1220)

Adds a configurable fault injection framework for testing retry, failover, and circuit breaker behavior. The FaultInjector attaches to StubLlm and provides per-call control over failure type, timing, and sequencing.

Components:
- FaultType: maps to LlmError variants (RequestFailed, RateLimited, AuthFailed, InvalidResponse, IoError, ContextLengthExceeded, SessionExpired)
- FaultAction: Succeed, Fail(FaultType), Delay(Duration)
- FaultMode: SequenceOnce (play then succeed), SequenceLoop (repeat forever), Random (seeded xorshift64 PRNG for reproducibility)
- FaultInjector: thread-safe (AtomicU32 counter + Mutex RNG)

Integration:
- StubLlm gains optional fault_injector field via with_fault_injector()
- When set, takes precedence over should_fail/error_kind
- Backward compatible: existing StubLlm usage unchanged

Closes #1220

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(testing): address review feedback on FaultInjector

- Remove redundant .abs() in random fault comparison
- Extract check_faults() helper to DRY up StubLlm methods
- Guard xorshift seed=0 (fixed point) by mapping to 1
- Add StubLlm integration test (stub_llm_fault_injector_sequence)
- Remove dead seed field from FaultMode::Random
- Move pub mod fault_injection to top of mod.rs
- Add Debug impl for FaultInjector
- Add empty_sequence_always_succeeds test
- Add random_seed_zero_does_not_always_fail test

* fix(testing): address #1233 review -- seed-0 bug, reset(), Debug derive

- Store seed in FaultMode::Random so reset() can re-init the RNG
- Add reset() method for test reproducibility (re-seeds RNG, zeros counter)
- Strengthen seed=0 regression test to 100 iterations with stricter assertion
- Add reset_restores_random_rng_from_stored_seed test
- Debug impl and empty_sequence test were already present from prior commit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ci: re-trigger CI with latest changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: trigger new run with skip-regression-check label

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(testing): address PR #1233 review -- error_rate validation and edge cases

- Validate error_rate is in 0.0..=1.0 and not NaN (panics on invalid input)
- Fix error_rate==1.0 edge case: use <= instead of < so 1.0 always fails
- Add regression tests for error_rate validation (NaN, negative, >1.0)
- Add tests for error_rate boundary values (0.0 never fails, 1.0 always fails)
- Add delay action test using tokio::time::pause() for deterministic timing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
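The FaultInjector commits pack several design decisions into a few bullets (sequence vs. random modes, the seed=0 guard, `reset()`, error_rate validation), so a compact sketch may help. The type and method names below echo the commit messages, but the field layout, the `next_action` method, and the reduced two-variant `FaultType` are assumptions, not the actual ironclaw code. One deliberate difference: this sketch draws samples from [0, 1) and compares with `<`, which yields the boundary behavior the commits describe (0.0 never fails, 1.0 always fails); the PR itself reports using `<=`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
enum FaultType { RequestFailed, RateLimited } // subset, for illustration

#[derive(Debug, Clone, Copy, PartialEq)]
enum FaultAction { Succeed, Fail(FaultType), Delay(Duration) }

#[derive(Debug, Clone)]
enum FaultMode {
    SequenceOnce(Vec<FaultAction>), // play the sequence, then always succeed
    SequenceLoop(Vec<FaultAction>), // repeat the sequence forever
    Random { error_rate: f64, seed: u64, fault: FaultType },
}

#[derive(Debug)]
struct FaultInjector {
    mode: FaultMode,
    calls: AtomicU32, // per-call counter
    rng: Mutex<u64>,  // xorshift64 state
}

/// xorshift64 has a fixed point at 0, so a zero seed is mapped to 1.
fn initial_state(mode: &FaultMode) -> u64 {
    match mode {
        FaultMode::Random { seed, .. } if *seed != 0 => *seed,
        _ => 1,
    }
}

impl FaultInjector {
    fn new(mode: FaultMode) -> Self {
        if let FaultMode::Random { error_rate, .. } = &mode {
            // `contains` is false for NaN, so NaN panics here as well.
            assert!((0.0..=1.0).contains(error_rate), "error_rate must be in 0.0..=1.0");
        }
        let rng = Mutex::new(initial_state(&mode));
        Self { mode, calls: AtomicU32::new(0), rng }
    }

    /// Hypothetical accessor: returns the action for the next call.
    fn next_action(&self) -> FaultAction {
        let n = self.calls.fetch_add(1, Ordering::SeqCst) as usize;
        match &self.mode {
            FaultMode::SequenceOnce(seq) => seq.get(n).copied().unwrap_or(FaultAction::Succeed),
            FaultMode::SequenceLoop(seq) if seq.is_empty() => FaultAction::Succeed,
            FaultMode::SequenceLoop(seq) => seq[n % seq.len()],
            FaultMode::Random { error_rate, fault, .. } => {
                let mut state = self.rng.lock().unwrap();
                let mut x = *state;
                x ^= x << 13;
                x ^= x >> 7;
                x ^= x << 17;
                *state = x;
                // Top 53 bits mapped to a float in [0, 1).
                let sample = (x >> 11) as f64 / (1u64 << 53) as f64;
                if sample < *error_rate { FaultAction::Fail(*fault) } else { FaultAction::Succeed }
            }
        }
    }

    /// Re-seeds the RNG from the stored seed and zeros the call counter.
    fn reset(&self) {
        self.calls.store(0, Ordering::SeqCst);
        *self.rng.lock().unwrap() = initial_state(&self.mode);
    }
}
```

Storing the seed inside `FaultMode::Random` is what makes `reset()` possible: without it, a test could not replay the same random fault pattern after the RNG state had advanced.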
* feat(self-repair): wire stuck_threshold, store, and builder (#647)

Wire the previously dead-code fields in DefaultSelfRepair:
- stuck_threshold: detect_stuck_jobs() now filters by duration, only reporting jobs stuck longer than the configured threshold
- with_store(): wired in agent_loop.rs from AgentDeps.store for tool failure tracking via Database trait
- with_builder(): wired from register_builder_tool() return value through AppComponents and AgentDeps for automatic tool rebuilding
- tools: passed alongside builder for hot-reload logging

Remove all #[allow(dead_code)] annotations. Add regression tests for threshold-based filtering (both above and below threshold).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add missing `builder` field to AgentDeps in gateway workflow harness

After rebase onto staging, AgentDeps gained a `builder` field for self-repair tool rebuilding. The gateway workflow test harness was missing this field, causing CI compilation failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: retrigger CI

* fix: force CI refresh after path_routing_tests dedup

* test: add E2E test for stuck job repair and tool rebuild cycle

Tests the full self-repair flow requested in review:
1. Job transitions Pending -> InProgress -> Stuck
2. detect_stuck_jobs() finds it (zero threshold)
3. repair_stuck_job() recovers it back to InProgress
4. A broken tool is repaired via MockBuilder
5. Verify builder was invoked and repair succeeded

Uses a MockBuilder (impl SoftwareBuilder) that returns successful BuildResult without requiring an LLM or filesystem. Uses libsql test database for the store (increment_repair_attempts, mark_tool_repaired).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(self-repair): measure stuck_duration from Stuck transition, not started_at

- Use ctx.transitions to find the most recent Stuck transition timestamp instead of ctx.started_at (which reflects job start, not stuck time)
- Fix StuckJob.last_activity to use stuck transition timestamp
- Remove misleading "hot-reloaded into registry" log
- Remove stray "// ci fix" comment in memory.rs
- Add regression test: backdated started_at must not inflate stuck_duration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: re-trigger CI with latest changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add type annotation to Ok(()) in test to resolve E0282

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
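The stuck_duration fix is easier to see in code. The sketch below measures how long a job has been stuck from its most recent Stuck transition rather than from `started_at`, then applies the threshold filter. The `JobContext`, `Transition`, and `JobState` shapes, the `is_reportable` helper, and the inclusive `>=` comparison are assumptions made for illustration; the real types in ironclaw/src/agent/self_repair.rs may differ. It also assumes the caller only passes jobs that are currently in the Stuck state.

```rust
use std::time::{Duration, SystemTime};

#[derive(Debug, Clone, Copy, PartialEq)]
enum JobState { Pending, InProgress, Stuck }

struct Transition { to: JobState, at: SystemTime }

struct JobContext {
    started_at: SystemTime, // job start; deliberately NOT used for stuck time
    transitions: Vec<Transition>,
}

/// Time elapsed since the most recent transition into Stuck,
/// or None if the job never entered that state.
fn stuck_duration(ctx: &JobContext, now: SystemTime) -> Option<Duration> {
    let stuck_at = ctx
        .transitions
        .iter()
        .rev() // newest transition first
        .find(|t| t.to == JobState::Stuck)?
        .at;
    // A clock that moved backwards yields zero rather than an error.
    Some(now.duration_since(stuck_at).unwrap_or(Duration::ZERO))
}

/// Hypothetical filter: report only jobs stuck for at least `threshold`.
fn is_reportable(ctx: &JobContext, now: SystemTime, threshold: Duration) -> bool {
    stuck_duration(ctx, now).map_or(false, |d| d >= threshold)
}
```

With this shape, a job that started an hour ago but became stuck 30 seconds ago reports 30 seconds, which is the regression the "backdated started_at" test guards against.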
Code review
Found 1 issue:
ironclaw/tests/e2e_telegram_message_routing.rs Lines 183 to 201 in 8b15f8b
Lines 183-201: AgentDeps construction missing
Additional findings
Found additional issues in telegram message splitting tests:
ironclaw/channels-src/telegram/src/lib.rs Lines 432 to 441 in 8b15f8b
Lines 432-441: The comment states "Rejoined chunks must equal the original text exactly", but this contradicts the behavior documented at lines 76-77, where whitespace is dropped at split boundaries.
Performance & Production Issues
Found additional performance concerns:
ironclaw/src/agent/self_repair.rs Lines 123 to 128 in 8b15f8b
ironclaw/src/agent/self_repair.rs Line 261 in 8b15f8b
…tion (#1400)

- Add `builder: None` to AgentDeps initializer in e2e_telegram_message_routing test (field added in #712 but test not updated)
- Update go_to_extensions() in test_telegram_hot_activation to navigate via settings tab -> extensions subtab (extensions tab was moved to settings)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: navigate telegram E2E tests to channels subtab

wasm_channel extensions (like telegram) are now rendered in the Settings → Channels subtab, not the Extensions subtab. Update test_telegram_hot_activation to navigate there and use the correct card selector. Also mock /api/gateway/status which loadChannelsStatus fetches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: select telegram card by name, not first card in channels subtab

Built-in channel cards (Web Gateway, HTTP, etc.) render first in the channels subtab content, so .first matches them instead of the telegram extension card. Select by has_text="Telegram" to target the correct card.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: make gateway_status_handler parameterizable in mock helper

Address review feedback: extract default gateway status handler and accept an optional gateway_status_handler kwarg in mock_extension_lists for test flexibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…6242 chore: promote staging to staging-promote/b9e5acf6-23283208580 (2026-03-19 15:15 UTC)
…8580 chore: promote staging to staging-promote/3dcccc1e-23280048384 (2026-03-19 06:44 UTC)
e582166 into staging-promote/ec04354c-23271447493
Auto-promotion from staging CI
Batch range: 428303af1128e7f124ad623fc1338393a4d06fcc..3dcccc1e64ea92fef2a44cf413b7cf974821da96
Promotion branch: staging-promote/3dcccc1e-23280048384
Base: staging-promote/ec04354c-23271447493
Triggered by: Staging CI batch at 2026-03-19 04:37 UTC
Commits in this batch (22):
Current commits in this promotion (5)
Current base: staging-promote/ec04354c-23271447493
Current head: staging-promote/3dcccc1e-23280048384
Current range: origin/staging-promote/ec04354c-23271447493..origin/staging-promote/3dcccc1e-23280048384
- … `builder` field and update E2E extensions tab navigation (fix: resolve CI failures from missing builder field and stale E2E selector #1400)
Auto-updated by staging promotion metadata workflow
Waiting for gates:
Auto-created by staging-ci workflow